1.
Educ Psychol Meas ; 83(3): 473-494, 2023 Jun.
Article in English | MEDLINE | ID: mdl-37187694

ABSTRACT

As researchers in the social sciences, we often use assessments and questionnaires to study constructs that are not directly observable. But even in a well-designed and well-implemented study, rapid-guessing behavior may occur. Under rapid-guessing behavior, a task is skimmed briefly but not read and engaged with in depth. Hence, a response given under rapid-guessing behavior biases constructs and relations of interest. Bias also appears plausible for latent speed estimates obtained under rapid-guessing behavior, as well as for the identified relation between speed and ability. This bias seems especially problematic considering that the relation between speed and ability has been shown to improve precision in ability estimation. For this reason, we investigate if and how responses and response times obtained under rapid-guessing behavior affect the identified speed-ability relation and the precision of ability estimates in a joint model of speed and ability. Therefore, the study presents an empirical application that highlights a specific methodological problem resulting from rapid-guessing behavior. Here, we show that different (non-)treatments of rapid guessing can lead to different conclusions about the underlying speed-ability relation. Furthermore, different rapid-guessing treatments led to wildly different conclusions about gains in precision through joint modeling. The results show the importance of taking rapid guessing into account when the psychometric use of response times is of interest.

2.
World J Biol Psychiatry ; 24(10): 924-935, 2023 Dec.
Article in English | MEDLINE | ID: mdl-35175174

ABSTRACT

Objectives. Evaluate the block-adaptive number series task of reasoning, as a time-efficient proxy of general cognitive ability in the Level-2 sample of the German National Cohort (NAKO), a population-based mega cohort. Methods. The number series task consisted of two blocks of three items each, administered as part of the touchscreen-based assessment. Based on performance on the first three items, a second block of appropriate difficulty was automatically administered. Scoring of performance was based on the Rasch model. Relations of performance scores to age, sex, education, study centre, language proficiency, and scores on other cognitive tasks were examined. Results. Except for one very difficult item, the data of the remaining 14 items showed sufficient fit to the Rasch model (Infit: 0.89-1.04; Outfit: 0.80-1.08). The resulting performance scores (N = 21,056) had a distribution that was truncated at very high levels of ability. The reliability of the performance estimates was satisfactory. Relations to age, sex, education, and the executive function factor of the other cognitive tasks in the NAKO supported the validity. Conclusions. The number series task provides a valid proxy of general cognitive ability for the Level-2 sample of the NAKO, based on a highly time-efficient assessment procedure.
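
As a rough illustration of the Rasch-based scoring mentioned in this abstract, the sketch below computes the probability of a correct response from a person ability and an item difficulty on a common logit scale. The function name `rasch_probability` and all example values are assumptions made for illustration; they are not taken from the NAKO study.

```python
import math


def rasch_probability(theta: float, difficulty: float) -> float:
    """Probability of a correct response under the Rasch model.

    theta (person ability) and difficulty (item difficulty) are on the same
    logit scale; the values used below are illustrative only.
    """
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))


# A person whose ability equals the item difficulty has a 50% chance of success.
print(rasch_probability(0.0, 0.0))

# For the same person, an easier item (lower difficulty) is more likely solved.
print(rasch_probability(0.0, -1.0) > rasch_probability(0.0, 1.0))
```

Because only the difference between ability and difficulty enters the formula, a second block of items can be matched to a respondent by choosing difficulties close to the provisional ability level.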


Subject(s)
Cognition, Language, Humans, Reproducibility of Results, Psychometrics, Surveys and Questionnaires
3.
Front Psychol ; 13: 954532, 2022.
Article in English | MEDLINE | ID: mdl-36405144

ABSTRACT

In large-scale assessments, disengaged participants might rapidly guess on items or skip items, which can affect the score interpretation's validity. This study analyzes data from a linear computer-based assessment to evaluate a micro-intervention that blocked the possibility to respond for 2 s. The blocked response was implemented to prevent participants from accidental navigation and as a naive attempt to prevent rapid guesses and rapid omissions. The response process was analyzed by interpreting log event sequences within a finite-state machine approach. Responses were assigned to different response classes based on the event sequence. Additionally, post hoc methods for detecting rapid responses based on response time thresholds were applied to validate the classification. Rapid guesses and rapid omissions could be distinguished from accidental clicks by the log events following the micro-intervention. Results showed that the blocked response interfered with rapid responses but hardly led to behavioral changes. However, the blocked response could improve the post hoc detection of rapid responding by identifying responses that narrowly exceed time-bound thresholds. In an assessment context, it is desirable to prevent participants from accidentally skipping items, which in itself may lead to an increasing popularity of initially blocking responses. If, however, data from those assessments is analyzed for rapid responses, additional log data information should be considered.
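
The post hoc detection of rapid responses via response time thresholds, as referred to in this abstract, can be sketched in a few lines. This is a minimal illustration under assumed values: the function name `flag_rapid_responses`, the 3-second threshold, and the response times are hypothetical and do not come from the study.

```python
from typing import List


def flag_rapid_responses(response_times: List[float], threshold: float) -> List[bool]:
    """Flag each response whose time (in seconds) falls below the threshold."""
    return [rt < threshold for rt in response_times]


# Hypothetical response times in seconds and a hypothetical 3-second threshold.
times = [1.2, 2.4, 3.1, 15.0, 42.7]
flags = flag_rapid_responses(times, threshold=3.0)
print(flags)
```

In this toy data, the response at 3.1 s narrowly exceeds the threshold and is not flagged. A purely time-based rule cannot classify such borderline cases, which is where additional log-event information may help.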

5.
Psychometrika ; 87(2): 593-619, 2022 Jun.
Article in English | MEDLINE | ID: mdl-34855118

ABSTRACT

Careless and insufficient effort responding (C/IER) can pose a major threat to data quality and, as such, to validity of inferences drawn from questionnaire data. A rich body of methods aiming at its detection has been developed. Most of these methods can detect only specific types of C/IER patterns. However, typically different types of C/IER patterns occur within one data set and need to be accounted for. We present a model-based approach for detecting manifold manifestations of C/IER at once. This is achieved by leveraging response time (RT) information available from computer-administered questionnaires and integrating theoretical considerations on C/IER with recent psychometric modeling approaches. The approach a) takes the specifics of attentive response behavior on questionnaires into account by incorporating the distance-difficulty hypothesis, b) allows for attentiveness to vary on the screen-by-respondent level, c) allows for respondents with different trait and speed levels to differ in their attentiveness, and d) at once deals with various response patterns arising from C/IER. The approach makes use of item-level RTs. An adapted version for aggregated RTs is presented that supports screening for C/IER behavior on the respondent level. Parameter recovery is investigated in a simulation study. The approach is illustrated in an empirical example, comparing different RT measures and contrasting the proposed model-based procedure against indicator-based multiple-hurdle approaches.


Subject(s)
Psychometrics, Computer Simulation, Psychometrics/methods, Reaction Time, Self Report, Surveys and Questionnaires
6.
Front Psychol ; 11: 562450, 2020.
Article in English | MEDLINE | ID: mdl-33192832

ABSTRACT

The digital revolution has made a multitude of text documents from highly diverse perspectives on almost any topic easily available. Accordingly, the ability to integrate and evaluate information from different sources, known as multiple document comprehension, has become increasingly important. Because multiple document comprehension requires the integration of content and source information across texts, it is assumed to exceed the demands of single text comprehension due to the inclusion of two additional mental representations: the integrated situation model and the intertext model. To date, there is little empirical evidence on commonalities and differences between single text and multiple document comprehension. Although the relationships between single text and multiple document comprehension can be well distinguished conceptually, there is a lack of empirical studies supporting these assumptions. Therefore, we investigated the dimensional structure of single text and multiple document comprehension with similar test setups. We examined commonalities and differences between the two forms of text comprehension in terms of their relations to final school exam grades, level of university studies and university performance. Using a sample of n = 501 students from two German universities, we jointly modeled single text and multiple document comprehension and applied a series of regression models. Concerning the relationship between single text and multiple document comprehension, confirmatory dimensionality analyses revealed the best fit for a model with two separate factors (latent correlation: 0.84) compared to a two-dimensional model with cross-loadings and fixed covariance between the latent factors and a model with a general factor. Accordingly, the results indicate that single text and multiple document comprehension are separable yet correlated constructs. 
Furthermore, we found that final school exam grades, level of university studies and prior university performance statistically significantly predicted both single text and multiple document comprehension and that expected future university performance was predicted by multiple document comprehension. There were also statistically significant relationships between multiple document comprehension and these variables when single text comprehension was taken into account. The results imply that multiple document comprehension is a construct that is closely related to single text comprehension yet empirically differs from it.

7.
Front Psychol ; 11: 884, 2020.
Article in English | MEDLINE | ID: mdl-32528352

ABSTRACT

International large-scale assessments, such as the Program for International Student Assessment (PISA), are conducted to provide information on the effectiveness of education systems. In PISA, the target population of 15-year-old students is assessed every 3 years. Trends show whether competencies have changed in the countries between PISA cycles. In order to provide valid trend estimates, it is desirable to retain the same test conditions and statistical methods in all PISA cycles. In PISA 2015, however, the test mode changed from paper-based to computer-based tests, and the scaling method was changed. In this paper, we investigate the effects of these changes on trend estimation in PISA using German data from all PISA cycles (2000-2015). Our findings suggest that the change from paper-based to computer-based tests could have a severe impact on trend estimation but that the change of the scaling model did not substantially change the trend estimates.

8.
Br J Educ Psychol ; 89(3): 524-537, 2019 Sep.
Article in English | MEDLINE | ID: mdl-30980396

ABSTRACT

BACKGROUND: With digital technologies, competence assessments can provide process data, such as mouse clicks with corresponding timestamps, as additional information about the skills and strategies of test takers. However, in order to use variables generated from process data sensibly for educational purposes, their interpretation needs to be validated with regard to their intended meaning. AIMS: This study seeks to demonstrate how process data from an assessment of multiple document comprehension can be used to represent sourcing, which summarizes activities for the consideration of the origin and intention of documents. The investigated process variables were created according to theoretical assumptions about sourcing, and systematically tested for differences between persons, units (i.e., documents and items), and properties of the test administration. SAMPLE: The sample included 310 German university students (79.4% female), enrolled in several bachelor's or master's programmes of the social sciences and humanities. METHODS: Regarding the hierarchical data structure, the hypotheses were analysed with generalized linear mixed models (GLMM). RESULTS: The results mostly revealed expected differences between individuals and units. However, unexpected effects of the administered order of units and documents were detected. CONCLUSIONS: The study demonstrates the theory-informed construction of process variables from log-files and an approach for empirical validation of their interpretation. The results suggest that students apply sourcing for different reasons, but also stress the need for further validation studies and refinements in the operationalization of the indicators investigated.


Subject(s)
Academic Performance, Comprehension, Students, Thinking, Universities, Adolescent, Adult, Female, Humans, Male, Young Adult
9.
Br J Math Stat Psychol ; 70(2): 238-256, 2017 May.
Article in English | MEDLINE | ID: mdl-28474772

ABSTRACT

Completing test items under multiple speed conditions avoids the performance measure being confounded with individual differences in the speed-accuracy compromise, and offers insights into the response process, that is, how response time relates to the probability of a correct response. This relation is traditionally represented by two conceptually different functions: the speed-accuracy trade-off function (SATF) across conditions relating the condition average response time to the condition average of accuracy, and the conditional accuracy function (CAF) within a condition describing accuracy conditional on response time. Using a generalized linear mixed modelling approach, we propose an item response modelling framework that is suitable for item response and response time data from experimental speed conditions. The proposed SATF and CAF model accommodates response time effects between conditions (i.e., person and item SATF slope) and within conditions (i.e., residual CAF slopes), captures person and item differences in these effects, and is suitable for measures with a strong speed component. Moreover, for a single condition a CAF model is proposed distinguishing person, item and residual CAF. The properties of the models are illustrated with an empirical example.


Subject(s)
Models, Psychological, Problem Solving, Reaction Time, Humans, Individuality, Linear Models, Probability, Reaction Time/physiology
10.
J Psychosom Res ; 75(5): 437-443, 2013 Nov.
Article in English | MEDLINE | ID: mdl-24182632

ABSTRACT

OBJECTIVE: This study simulated computer-adaptive testing (CAT) based on the Aachen Depression Item Bank (ADIB), which was developed for the assessment of depression in persons with somatic diseases. Prior to the CAT simulation, the ADIB was newly calibrated. METHODS: Recalibration was performed in a sample of 161 patients treated for a depressive syndrome, 103 patients from cardiology, and 103 patients from otorhinolaryngology (mean age 44.1, SD=14.0; 44.7% female) and was cross-validated in a sample of 117 patients undergoing rehabilitation for cardiac diseases (mean age 58.4, SD=10.5; 24.8% women). Unidimensionality of the item bank was checked and a Rasch analysis was performed that evaluated local dependency (LD), differential item functioning (DIF), item fit and reliability. CAT simulation was conducted with the total sample and additional simulated data. RESULTS: Recalibration resulted in a strictly unidimensional item bank with 36 items, showing good Rasch model fit (item fit residuals<|2.5|) and no DIF or LD. CAT simulation revealed that 13 items on average were necessary to estimate depression in the range between -2 and +2 logits when terminating at SE≤0.32, and 4 items when using SE≤0.50. Receiver Operating Characteristics analysis showed that θ estimates based on the CAT algorithm have good criterion validity with regard to depression diagnoses (Area Under the Curve≥.78 for all cut-off criteria). CONCLUSION: The recalibration of the ADIB succeeded, and the simulation studies conducted suggest that it has good screening performance in the samples investigated and that it may reasonably add to the improvement of depression assessment.
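
The SE-based stopping rules reported in this abstract can be illustrated with a small sketch. It assumes the standard relation SE = 1 / sqrt(accumulated test information) and, as a simplification, treats each item's information as a fixed number supplied by the caller. The function name `items_until_stop` and the information values are hypothetical and are not derived from the ADIB.

```python
import math
from typing import Iterable, Optional


def items_until_stop(item_information: Iterable[float], se_threshold: float) -> Optional[int]:
    """Return how many items are administered before SE <= se_threshold.

    SE is computed as 1 / sqrt(accumulated information). Returns None if the
    threshold is never reached with the items supplied.
    """
    total_information = 0.0
    for count, info in enumerate(item_information, start=1):
        total_information += info
        if total_information > 0 and 1.0 / math.sqrt(total_information) <= se_threshold:
            return count
    return None


# Hypothetical bank in which every item contributes 1.0 unit of information.
bank = [1.0] * 36
print(items_until_stop(bank, se_threshold=0.50))
print(items_until_stop(bank, se_threshold=0.32))
```

In a real CAT, the information an item contributes depends on the current ability estimate and on which item is selected next, so the counts from this sketch should not be compared with the study's results. It only shows why a stricter SE threshold requires more items.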


Subject(s)
Depression/diagnosis, Depressive Disorder/diagnosis, Heart Diseases/psychology, Mental Disorders/complications, Psychometrics, Adult, Aged, Depression/etiology, Depressive Disorder/etiology, Female, Humans, Male, Mass Screening, Middle Aged, ROC Curve, Reproducibility of Results, Software, Surveys and Questionnaires
11.
Arch Phys Med Rehabil ; 94(12): 2433-2439, 2013 Dec.
Article in English | MEDLINE | ID: mdl-23880319

ABSTRACT

OBJECTIVE: To develop and evaluate a computer adaptive test for the assessment of anxiety in cardiovascular rehabilitation patients (ACAT-cardio) that tailors an optimal test for each patient and enables precise and time-effective measurement. DESIGN: Simulation study, validation study (against the anxiety subscale of the Hospital Anxiety and Depression Scale and the physical component summary scale of the 12-Item Short-Form Health Survey), and longitudinal study (beginning and end of rehabilitation). SETTING: Cardiac rehabilitation centers. PARTICIPANTS: Cardiovascular rehabilitation patients: simulation study sample (n=106; mean age, 57.8y; 25.5% women) and validation and longitudinal study sample (n=138; mean age, 58.6 and 57.9y, respectively; 16.7% and 12.1% women, respectively). INTERVENTIONS: Not applicable. MAIN OUTCOME MEASURES: Hospital Anxiety and Depression Scale, 12-Item Short-Form Health Survey, and ACAT-cardio. RESULTS: The mean number of items was 9.2 with an average processing time of 1:13 minutes when an SE ≤.50 was used as a stopping rule; with an SE ≤.32, there were 28 items and a processing time of 3:47 minutes. Validity could be confirmed via correlations between .68 and .81 concerning convergent validity (ACAT-cardio vs Hospital Anxiety and Depression Scale anxiety subscale) and correlations between -.47 and -.30 concerning discriminant validity (ACAT-cardio vs 12-Item Short-Form Health Survey physical component summary scale). Sensitivity to change was moderate to high with standardized response means between .45 and .82. CONCLUSIONS: The ACAT-cardio shows good psychometric properties and provides the opportunity for an innovative and time-effective assessment of anxiety in cardiovascular rehabilitation. A more flexible stopping rule might further improve the ACAT-cardio. Additionally, testing in other cardiovascular populations would increase generalizability.


Subject(s)
Anxiety/diagnosis, Cardiac Rehabilitation, Cardiovascular Diseases/psychology, Surveys and Questionnaires, Female, Humans, Longitudinal Studies, Male, Middle Aged, Psychiatric Status Rating Scales, Psychometrics